Identifying archetypal perspectives in news articles
Authors
Abstract
A novel approach to news aggregation is proposed. Rather than ranking or summarising topic clusters, we propose grouping articles by topic similarity and then clustering within each topic group to identify archetypal articles that represent the various perspectives on a topic. An example application is examined and a preliminary user study is discussed. Future applications and an evaluation of validity are outlined.
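The two-stage idea in the abstract (group by topic, then look for representative articles inside each group) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' method: the bag-of-words vectors, the greedy threshold grouping, the farthest-first selection used as a stand-in for finding archetypal articles, the function names `vectorize`, `cosine`, `group_by_topic` and `archetypes`, and the toy headlines.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words term-frequency vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def group_by_topic(vectors, threshold):
    """Stage 1 (assumed): greedy grouping against the first article of each group."""
    groups = []  # lists of article indices; group[0] acts as the topic seed
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


def archetypes(group, vectors, k):
    """Stage 2 (assumed): pick up to k mutually dissimilar representatives."""
    # Start from the medoid, the article most similar to the rest of its group.
    medoid = max(group, key=lambda i: sum(cosine(vectors[i], vectors[j]) for j in group))
    chosen = [medoid]
    while len(chosen) < min(k, len(group)):
        rest = [i for i in group if i not in chosen]
        # Farthest-first: take the article least similar to its closest representative.
        chosen.append(
            min(rest, key=lambda i: max(cosine(vectors[i], vectors[c]) for c in chosen))
        )
    return chosen


# Hypothetical toy headlines, invented for this sketch.
articles = [
    "tax bill passes senate vote praised as growth plan",
    "tax bill passes senate vote criticised as gift to wealthy",
    "senate tax bill vote praised by business as jobs plan",
    "storm floods coastal town residents evacuated",
    "storm floods coastal town officials blamed for slow response",
]
vectors = [vectorize(a) for a in articles]
groups = group_by_topic(vectors, threshold=0.3)
for group in groups:
    for idx in archetypes(group, vectors, k=2):
        print(group, "->", articles[idx])
```

On this toy data the tax-bill group yields one article that praises the bill and one that criticises it. A real system would need a stronger text representation and a principled choice of the number of perspectives per topic.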
Similar resources
Searching for Diverse Perspectives in News Articles: Using an LSTM Network to Classify Sentiment
When searching for emerging news on named entities, many users wish to find articles containing a variety of perspectives. Advances in NLP, particularly in tools that use Recurrent Neural Networks (RNNs), have produced impressive gains in accuracy on tasks such as sentiment analysis. Here we describe and implement a special type of RNN called a Long Short Term Memory ...
Identifying News Broadcasters’ Ideological Perspectives Using a Large-Scale Video Ontology
Television news has been the predominant way of understanding the world around us, but individual news broadcasters can frame or mislead audiences’ understanding of political and social issues. We aim to develop a computer system that can automatically identify highly biased television news, which may prompt audiences to seek news stories from contrasting viewpoints. But can computers determi...
Named Entity Oriented Difference Analysis of News Articles and Its Application
To support the efficient gathering of diverse information about a news event, we focus on descriptions of named entities (persons, organizations, locations) in news articles. We extend the stakeholder mining proposed by Ogawa et al. and extract descriptions of named entities in articles. We propose three measures (difference in opinion, difference in details, and difference in factor coverage) ...
Identification in a Text Corpus
TopCat (Topic Categories) is a technique for identifying topics that recur in articles in a text corpus. Natural language processing techniques are used to identify key entities in individual articles, allowing us to represent an article as a set of items. This allows us to view the problem in a database/data mining context: Identifying related groups of items. This paper presents a novel metho...
Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents
Besides its own merits, text classification (TC) has become a cornerstone in many applications. The work presented here is part of, and a prerequisite for, a project we have undertaken to create a corpus for Arabic text processing. It is an attempt to automatically create modules that would help speed up the process of classification for any text categorization task. It also serves as a tool for...